@dtronmans
Contributor

Purpose

ImageNet has well-known issues with duplicate class names (https://gist.github.com/aaronpolhamus/964a4411c0906315deb9f4a3723aac57). This PR introduces manual fixes for the duplicate class names, similar to how the COCO parser fixes duplicate images.
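
As a rough illustration of what a manual fix for duplicate class names can look like, here is a minimal sketch. It is not the code from this PR: the function `dedupe_class_names`, the class IDs, and the replacement names are all hypothetical placeholders chosen for the example.

```python
from collections import Counter


def dedupe_class_names(id_to_name, manual_fixes):
    """Return a new id -> name mapping in which every name is unique.

    `manual_fixes` maps a class ID to a hand-picked replacement name.
    Both arguments are hypothetical inputs for this sketch.
    """
    # Apply the hand-written overrides, keep every other name as-is.
    fixed = {cid: manual_fixes.get(cid, name) for cid, name in id_to_name.items()}

    # Fail loudly if any duplicate was not covered by a manual fix.
    counts = Counter(fixed.values())
    remaining = sorted(name for name, count in counts.items() if count > 1)
    if remaining:
        raise ValueError(f"duplicate class names remain: {remaining}")
    return fixed


# Placeholder IDs and names, not real dataset entries.
classes = {"id_001": "crane", "id_002": "crane", "id_003": "tench"}
fixes = {"id_002": "crane (machine)"}
cleaned = dedupe_class_names(classes, fixes)
```

Raising on leftover duplicates, rather than silently renaming them, keeps the fix list explicit and reviewable.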

It is also currently not possible to retrieve the original class mapping: after parsing the dataset, only the current (re-ordered) mapping is left. I added a way to retrieve the original one.
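
To make the idea of keeping the original mapping concrete, here is a small sketch. It assumes the re-ordering is an alphabetical sort and that class names are already unique; both are assumptions for the example, and `reindex_classes` is a hypothetical helper, not a function from luxonis_ml.

```python
def reindex_classes(original_names):
    """Re-order classes while remembering where each one came from.

    `original_names[i]` is the class name at original index `i`.
    Names are assumed to be unique (for example, after cleaning).
    """
    # Assumed re-ordering for this sketch: alphabetical by class name.
    sorted_names = sorted(original_names)
    current = {name: idx for idx, name in enumerate(sorted_names)}
    original = {name: idx for idx, name in enumerate(original_names)}

    # Lookup table from the new (current) index back to the original index.
    current_to_original = {current[name]: original[name] for name in original_names}
    return current, original, current_to_original


current, original, current_to_original = reindex_classes(["tench", "goldfish", "crane"])
```

With a table like `current_to_original`, predictions expressed in the re-ordered indices can be translated back to the dataset's native indices.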

Some users may want to work with the datasets as they are, underlying issues included, for example with duplicate class names. For this case I added the --no-clean flag, which parses the original datasets without changes.
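
The sketch below shows how an opt-out flag of this kind can gate a cleaning step. It is a standalone `argparse` example, not the actual luxonis_ml CLI; `build_parser`, `clean_names`, and `parse_dataset` are hypothetical, and the suffix-based renaming is only a stand-in for whatever cleaning the parser really performs.

```python
import argparse


def build_parser():
    """Build a hypothetical parser where cleaning is on by default."""
    parser = argparse.ArgumentParser(description="Hypothetical dataset parser")
    # store_false makes `clean` default to True; passing --no-clean sets it to False.
    parser.add_argument("--no-clean", dest="clean", action="store_false",
                        help="parse the original dataset without changes")
    return parser


def clean_names(names):
    """Stand-in cleaning step: suffix repeated names with their occurrence count."""
    seen = {}
    cleaned = []
    for name in names:
        seen[name] = seen.get(name, 0) + 1
        cleaned.append(name if seen[name] == 1 else f"{name}_{seen[name]}")
    return cleaned


def parse_dataset(names, clean):
    """Return class names, cleaned unless the caller opted out."""
    return clean_names(names) if clean else list(names)


raw = ["crane", "crane", "tench"]
default_args = build_parser().parse_args([])
no_clean_args = build_parser().parse_args(["--no-clean"])
```

Keeping cleaning as the default and making the raw behaviour an explicit opt-out matches the intent described above: most users get the fixed names, while the original data stays reachable.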

Specification

Dependencies & Potential Impact

Deployment Plan

Testing & Validation

Tested locally by both me and @ptoupas, whom I also tagged in this PR.

@dtronmans dtronmans requested a review from a team as a code owner January 13, 2026 12:58
@dtronmans dtronmans requested review from conorsim, klemen1999, kozlov721 and tersekmatija and removed request for a team January 13, 2026 12:58
@github-actions github-actions bot added the fix (Fixing a bug), data (Changes affecting luxonis_ml.data subpackage), and CLI (Changes affecting the CLI) labels Jan 13, 2026
@dtronmans dtronmans requested a review from ptoupas January 13, 2026 12:58
@ptoupas ptoupas left a comment

LGTM.
I've tested the changes both on COCO-2017 and imagenet-sample and they work as expected.
I left a minor comment regarding naming conventions for the imagenet dataset.

Labels

CLI (Changes affecting the CLI), data (Changes affecting luxonis_ml.data subpackage), fix (Fixing a bug)

3 participants